Commodore 64 Scene Diskmags Assortment

Commodore_CEE_Vol._1_Issue_05_1995_Jack_Vander_White_Disk_2_of_3_Side_B.d64 / program bit 7

Text File | 2023-02-26 | 18KB | 305 lines

NMI/IRQ
From : George Hug

Ed Bell writes:

EB> The NMI/IRQ switch has nothing to do with locating the cartridge.
EB> That switch is only used, as far as I know, for CP/M terminals.
EB> I know that George Hug has said in the past that God never intended for the NMI to be used, but that is the way everyone has done things as far as I've ever seen.

I think the use of NMI got started with the VIC-20 and C64, because they had to process every incoming and outgoing bit individually, and that work had to have priority over the other stuff going on. However, using the 6551 should allow you to just use the IRQ line like any other device. In fact, the +4's microprocessor (the 7501/8501, I think) doesn't even have an NMI pin, so everything runs off IRQ. But since the +4 has the built-in 6551, it works fine. Of course at some point (14.4k maybe) I suspect you may start to push the limits of the system overall, even with a 6551.

The main theoretical problem with using the 6551 on the NMI line is that the NMI is edge-triggered and (of course) non-maskable. When you read the status byte of the 6551, it re-enables the interrupt, so some subsequent event can send you back into the NMI handler at a point where you are not expecting it. But there are ways around this; you just have to be careful writing the NMI handler.

---------------------------------

oVerflow ?
From : msmakela@sranje.tky.hut.fi

Ralph Mason <ralph.mason@liffe.com> writes:

Ralph> Can someone please give me a description of how the overflow flag is set by the ADC and SBC opcodes. There does not seem to be much information about this flag. Any description or a pseudo code description would be very helpful.

There are already some descriptions posted, but as they weren't complete, here's the one from the 64doc file (which is part of the X64 emulator project). The whole file is available on the WWW, as http://www.hut.fi/~msmakela/cbm/emul/x64/64doc.html.

V oVerflow flag

Like the Negative flag, this flag is intended to be used with 8-bit signed integer numbers. The flag will be affected by addition and subtraction, the instructions PLP, CLV and BIT, and the hardware signal -SO. Note that there is no SEV instruction, even though the MOS engineers loved to use East European abbreviations, like DDR (Deutsche Demokratische Republik vs. Data Direction Register). (The Russian abbreviation for their former trade association COMECON is SEV.)

The -SO (Set Overflow) signal is available on some processors, at least the 6502, to set the V flag. This enables response to an I/O activity in three clock cycles or fewer when using a BVC instruction branching to itself ($50 $FE).

The CLV instruction clears the V flag, and the PLP and BIT instructions copy the flag value from bit 6 of the topmost stack entry or from memory.

After a binary addition or subtraction, the V flag will be set on a sign overflow, cleared otherwise. What is a sign overflow? For instance, if you are trying to add 123 and 45 together, the result (168) does not fit in an 8-bit signed integer (upper limit 127 and lower limit -128). Similarly, adding -123 to -45 causes the overflow, just like subtracting -45 from 123 or 123 from -45 would do.

Like the N flag, the V flag will not be set as expected in the Decimal mode. Later in this document is a precise operation description.

A common misbelief is that the V flag could only be set by arithmetic operations, not cleared.

Well, that description wasn't as scientific as I thought. :-( So, here's some C code that sets the V flag after an addition:

    /* A += s */
    t = A + s + C;
    V = (t ^ A) & 128 && !((A ^ s) & 128);

If you want to see how the V flag is set in Decimal mode or with the ARR instruction ($6B), refer to the 64doc file.

----

From: Asger Alstrup

The overflow flag is used to detect overflow in math when you calculate with two's complement numbers.
Two's complement is a technique used to represent negative numbers in binary with the property that addition and subtraction can be performed as usual, giving the correct result. Two's complement is basically a clever encoding for the numbers:

    8-bit value    two's complement interpretation
      0               0
      1               1
      2               2
      ..
      127             127
      128            -128
      129            -127
      ...
      255            -1

To negate a two's complement number, simply invert all the bits in the number, and add 1 to the number as usual.

Given a number in two's complement, you can use this program to get the absolute value, with the sign in carry:

          cmp #$80
          bcc positive
          eor #$ff     ;Invert all bits
          adc #0       ;Carry is 1, so this will add 1
          sec
positive

where the carry will be set for negative numbers. The other way is similar:

          bcc positive
          eor #$ff     ;Invert all bits
          adc #0       ;Carry is 1, so this will add 1
positive

For ADC, the overflow flag in the 6502 is set when the carry from bit 6 into bit 7 differs from the carry out of bit 7; for SBC it detects the corresponding underflow. I.e.:

          clc
          lda #%01000000 ;Decimal 64
          adc #%01000000 ;Decimal 64

will set the overflow flag, since a carry is generated from bit 6 to 7 and none leaves bit 7. The range for standard two's complement numbers in 8 bits is [-128,127], so the overflow flag detected that we can't represent 64+64=128 in two's complement.

          clc
          lda #%11000000 ;-64
          adc #%11111111 ;-1

will not set the overflow flag, but rather the carry flag: the result, -65, is still in range.

Conclusion: The overflow flag is used in connection with the carry flag to establish whether the result of the addition/subtraction can be represented as a valid two's complement number in 8 bits.

Of course, the overflow flag is multiplexed and used in connection with the BIT command for a totally different purpose. Furthermore, the 1541 uses a feature of the 6502 as an ingenious timing instrument, where hardware changes the overflow bit when a byte is ready to be read from or written to the disk. This means that the command

loop      bvc loop

takes care of business.
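
For anyone who wants to experiment with the rule off the machine, here is a minimal C sketch of a binary-mode ADC built around the 64doc formula quoted earlier. The struct and function names are made up for illustration, and Decimal mode is not modelled at all.

```c
#include <assert.h>
#include <stdint.h>

/* Result of one binary-mode ADC. Names are illustrative only. */
typedef struct {
    uint8_t a;     /* accumulator after the addition */
    int carry;     /* C flag: unsigned sum did not fit in 8 bits */
    int overflow;  /* V flag: signed sum did not fit in -128..127 */
} adc_result;

adc_result adc_binary(uint8_t a, uint8_t s, int carry_in)
{
    assert(carry_in == 0 || carry_in == 1);

    unsigned t = (unsigned)a + s + (unsigned)carry_in;
    adc_result r;

    r.a = (uint8_t)t;
    r.carry = t > 0xFF;
    /* V is set when both operands have the same sign and the
       sign of the result differs from it (the 64doc formula). */
    r.overflow = ((t ^ a) & 0x80) && !((a ^ s) & 0x80);
    return r;
}
```

Calling adc_binary(64, 64, 0) reports overflow without carry, while adc_binary(0xC0, 0xFF, 0) reports carry without overflow. For SBC, the usual trick is to feed the inverted operand into the same routine, since the 6502 subtracts by adding the one's complement plus carry.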
-------------------------------

Subj : Direct threaded inner interpreters
From : brmcf@utkux1

Paul van Loon wrote:

> Just a while back I did a FORTH on my C64. To get some speed out of it, I made it a subroutine threaded FORTH. Basically you compile an extra $20 (JSR) with every address you compile.

Yeah: no matter how you swing it, the speed improvement in subroutine threaded code is hard to argue with in a 6502 machine. You only pay an extra byte overhead per compiled word and since NEXT is a one byte RTS instead of a two byte address (or execution token), you don't start paying the extra space overhead until the fourth word compiled.

> The only problem I encountered was with FORTH words that used the return stack, e.g. DO ... LOOP. You have to manage the processor stack yourself to get this done, but this is a lot less efficient than a simple PHA or PLA.

Exactly. I found this too cumbersome. Plus, if you aim to have a small number of concurrent tasks in preemptive multitasking, you need to partition the 256 byte hardware stack among the tasks; if you want to use it in a C64, you also lose whatever you allocate to the keyboard buffer -- say 16 bytes to keep it even. If you want to have 4 tasks with pre-emptive multi-tasking, that would only be 30 addresses deep in each partition, which isn't a whole lot to play too many explicit R-stack games. And at the speed of the 1MHz 6510 in a C64, anything but stack partitioning is too slow and memory consuming for pre-emptive multitasking.

> An alternative solution may be to have an extra 'RETURN' stack, that handles words like 'I'. If you want you can make 'R>' and '>R' to work on this stack or at the processor stack. But I think that using R> and >R to change the thread of execution is bad programming anyway.

So do I, but since I elect to make the >R and R> the pseudo R-stack that is used for I, I have a vested interest in not using the 'Rack' for flow of execution.

Here's my approach: dedicate the X register to the current task index. The Y-register is used as a general index register, and is never maintained across routines. To make more space available on the zero page, the top of stack is at the top of the stack: this costs about four cycles for most low level words, but makes drop faster, which often makes up for it. With a pseudo Return stack, you won't have to manipulate the stack pointer, so only the task switcher needs to use TXS and TSX, and the task switcher is 'aware' of which task is which, so it can afford to use the X-register.

PLUS      LDY OPNDX,X    ; zero-page
          CLC
          LDA TOS-LO,Y   ; 16-bit
          ADC SCND-LO,Y  ; TOS-LO+1
          STA SCND-LO,Y  ;
          LDA TOS-HI,Y   ; 16-bit
          ADC SCND-HI,Y  ; TOS-HI+1
          STA SCND-HI,Y  ;
          INY            ; separated STACK LO & STACK HI permits one INC
          STY OPNDX,X    ;
          RTS

TO-R      LDY OPNDX,X
          LDA TOS-LO,Y
          STA TEMP,X
          LDA TOS-HI,Y
          INY
          STY OPNDX,X
          LDY RNDX,X
          DEY
          STA RACK-HI,Y
          LDA TEMP,X
          STA RACK-LO,Y
          STY RNDX,X
          RTS

Another point raised in the past couple of weeks comes to mind: R@ will work here, but it will not be possible to fetch from the R stack with a pointer, since RACK-LO and RACK-HI do not live next to each other in memory: they are allocated with either 128 or 256 bytes in each stack partition, for 32 or 64 deep racks.

Up to 256 bytes total in each stack partition:

---------------------- Boundary: xx00 address
RACK-HI#3
RACK-HI#2
RACK-HI#1
RACK-HI#0
---------------------- Boundary: either xx70 or xx00 address
RACK-LO#3
RACK-LO#2
RACK-LO#1
RACK-LO#0
---------------------- Boundary: xx00 address

So for several reasons, I like the parts of the ANS standard that support independence from particular architectures.

----------------------------------

Stable Rasters
From : A.BOOSE@LDB.han.de

LORDTYM@news.delphi.com (LORDTYM@DELPHI.COM) writes:

>The $d011 register is often referred to as the 'magic' register.

That's because you can control the status "Bad Line Condition" (BLC) with the 3 LSB of this register.
Since the BLC triggers the video-counter (VC) and the row-counter (RC), you can get control of the 2 most important registers inside the VIC-II, deciding what data is displayed.

>I am beginning to think this is because nobody really knows the how or why it works.

Why not? Once you understand how the VIC-II displays a normal frame, you can easily explain any of what you called 'tricks'. It seems that many people don't know how the VIC-II generates the 40*25 color screen or the 320*200 bitmap. Let me try to explain it...

The data the VIC displays in every dot-clock cycle has to be fetched out of the system's RAM. It needs one byte per machine cycle (phi), so the C64's bus is split into phi1, when the VIC-II fetches its data, and phi2, when the CPU is allowed to own the bus. So all phi1 bandwidth (in the display area) is used to fetch data out of the char ROM/RAM.

The problem is, the VIC-II needs additional information about which characters/colors should be displayed. This information is fetched on every bad line during phi2, while the CPU is stopped. It's stored in a 40*12 bit char/col buffer called the video-matrix.

The information about which characters/colors should be displayed is supposed to be renewed every eight lines, so we need a BLC every eight lines. Therefore the status "BLC" is set if the screen is on and $30<=$d012<$f8 and the 3 LSB of $d011 and $d012 are equal. As $d012 counts the raster lines of a frame, the 3 LSBs are equal every 8 lines and BLC is set. If BLC is set, the VIC-II generates phi2 DMA between X=0 and X=$13f.

The VIC-II knows what char is to be displayed in the next phi1 cycle, but it also has to know which line of the char has to be displayed. Therefore it has a 3 bit wide row-counter (RC) which is set to 0 during BLC and(!!) X=0, and it counts up to 7 in the next 7 lines.

The video-counter supplies the address used to load the video-matrix during phi2 DMA. This counter is split in two 10 bit wide latches, the base (VCB) and the counter (VCC).
Together with the 4 MSB of $d018, the VCC generates the VIC-II addresses during the phi2 DMA. At the beginning of a frame VCB is cleared to 0. At X=0 of any line VCB is copied to VCC. VCC is counted up during the 40 display cycles. Only if RC is 7 and X=160, VCC is copied back to VCB (increasing the base).

Technically that's all. All $d011-LSB-effects can be derived as consequences of the statements above. The key thing is to toggle the BLC status for several cycles ( = changing the 3 LSB of $d011). For example:

FLD: Change the 3 LSB of $d011 so that they will never match with $d012. You will never get a BLC, so nothing (VIC-II address $3fff) is displayed.

Linecrunch: Let the 3 LSB of $d011 match $d012, but change the 3 LSB just before X=0 is reached. RC is not reset (it stays at 7), there is no phi2 DMA to load the video matrix, VCC is counted up by 40 and is copied to VCB on X=160. A whole char line is crunched to one raster line.

FLI: After a 'normal' bad line let the 3 LSB of $d011 match $d012 after X=0. Due to BLC the VIC-II performs a phi2 DMA again, but the RC isn't set to 0. Changing $d018 from raster line to raster line will provide independent colors 1&2 on any raster line. Note that the address/data drivers of the VIC-II remain off in phi2 during the first 3 BA=low cycles, so the VIC reads $ff instead. The color code of the $ffs is the low nibble of the opcode coming right after the $d011 access that caused the forced DMA. So maybe the left 3 chars are lost.

>I wonder if some hacker didn't just stumble across the properties of this register. The actual code does not appear to be timed except to start at exactly the same place each time. The inc $d011 instructions appear to be the timers and will wait until the current raster line is finished.

Without seeing the context it's hard to say. But you can use $d011 to synchronize to the raster line: just set BLC during X=0..$13f, and the 'forced' phi2 DMA will synchronize the CPU.
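
The BLC rule stated above is easy to check on paper with a few lines of C. This is only a sketch of that one rule, not a VIC-II model. The function names are hypothetical, "screen is on" is taken to mean the DEN bit (bit 4) of $d011, and the raster argument is the full line number, so bit 8 (which lives in bit 7 of $d011) is assumed to be already merged in by the caller.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Bad Line Condition: screen on, $30 <= raster < $f8, and the
   3 LSB of $d011 equal to the 3 LSB of the raster line. */
bool bad_line(uint8_t d011, unsigned raster)
{
    assert(raster < 512);                  /* raster counter is 9 bits */

    bool screen_on = (d011 & 0x10) != 0;   /* DEN bit, an assumption */
    bool in_range  = raster >= 0x30 && raster < 0xf8;
    bool match     = (d011 & 0x07) == (raster & 0x07);

    return screen_on && in_range && match;
}

/* FLD idea: pick 3 LSB for $d011 that cannot match this raster
   line. Using raster+1 is one possible choice, not the only one. */
uint8_t fld_yscroll(unsigned raster)
{
    return (uint8_t)((raster + 1) & 0x07);
}
```

With the power-on value $1b in $d011, bad_line() first becomes true on line $33 and then on every eighth line after it. Writing $18 | fld_yscroll(line) for each line keeps it false, which is the FLD case described above.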
-----------------------------------

Sample 1x1 Scrolly Code
From : xmikex@eyrie.stanford.edu

Someone asked me for sample scrolly code, and then a few others asked me, and so on, etc. Anyways, I post sample code here so that a good thread will emerge regarding scrolly coding techniques. In other words, tear this lame bit of code apart with your minds and make it better, or simply walk us through the best scrolly (monster scrolly) codes there are on C64/128. It's up to you. (This code works on C64 and C128 (40 cols) with no modification).

          org $3000     ; a fake op: tells the assembler where we want the
                        ; code to start. This fake op (pseudo op) can vary
                        ; from assembler to assembler.
          LDA #$00      ; ---------------------------------------------------
          STA $FD       ; All this is just so that we can do neato ind. index
          LDA #$32      ; addressing later, i.e., we are setting up the
          STA $FE       ; spot where scrolly text will be pulled from !
mainloop  LDX #$C7      ; Prepare the scrolly register for a "hard right"
bitloop   STX $D016     ; shove the "hard right" value into scrolly register
          TXA           ; Preserve X by shoving onto A
          PHA           ; and shove A into the stack (stack = temp storage)
                        ; (we need to clear X cuz we gonna use it in a delay loop)
                        ; (if we don't introduce a delay, the text will scroll by
                        ; WAY TOO FAST to READ ! ! Incredible, this 1 MHz bugger
                        ; machine of ours is, eh?) - delays not set in stone, play
                        ; with them :) higher, lower, etc...
          LDX #$05      ; just a delay loop - ldx counted down by dex
waitloop  LDY #$FF      ; a delay loop within a delay loop
wl2       DEY           ; decrement y in our delay loop
          BNE wl2       ; keep counting y down until y = 0
          DEX           ; decrement x in our top delay loop
          BNE waitloop  ; repeat the y loop until x = 0
          PLA           ; grab X from the stack and shove it onto A
          TAX           ; copy A into X (X = #$C7 from mainloop)
          DEX           ; ok, now we countdown X
          CPX #$BF      ; coarse scroll anyone?
          BNE bitloop   ; back to bit pushing
          LDX #$00      ; clear X
rt        LDA $7C1,x    ; scroll the last screen line one char to the left
          STA $7C0,x    ; we are shuffling one char into another
          INX           ; increment X
          CPX #$28      ; Hex $28 = Decimal 40 = width of VIC-II screen
          BNE rt        ; still in coarse scroll?
          INC $FD       ; here is where the indexing pays off, with
          BNE ty        ; these little lines all we gotta do is stick
          INC $FE       ; our scrolly-text data beginning (in this case)
ty        LDY #$00      ; at location $3200
          LDA ($FD),Y   ; :))))
          STA $7E7      ; shove the data into rightmost corner
          JMP mainloop  ; let it scrooooolll, baby! (end of loop)

$3200 is 12800 in decimal. If I wanted to put some character data there, I could just poke it in, as follows:

POKE 12800,65 = A at $3200
POKE 12801,66 = B at $3201
etc

------------------------------
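
To see the fine-scroll / coarse-scroll rhythm of the routine above without a C64 at hand, here is a rough C model of it. Everything in it is an assumption made for illustration: the struct and function names are invented, the row array stands in for screen RAM $07C0-$07E7, xscroll stands in for the low 3 bits of $D016, the delay loops are left out, and the text wraps around at the end, which the assembly version does not do.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

#define ROW_WIDTH 40

typedef struct {
    unsigned char row[ROW_WIDTH]; /* stands in for $07C0-$07E7 */
    int xscroll;                  /* stands in for $D016 bits 0-2 */
    const unsigned char *text;    /* scroll text */
    size_t len;                   /* length of the scroll text */
    size_t pos;                   /* next character to feed in */
} scroller;

void scroller_init(scroller *s, const unsigned char *text, size_t len)
{
    assert(len > 0);
    memset(s->row, ' ', ROW_WIDTH);
    s->xscroll = 7;               /* "hard right" */
    s->text = text;
    s->len = len;
    s->pos = 0;
}

/* Advance by one pixel. Returns 1 when a coarse scroll happened. */
int scroller_step(scroller *s)
{
    if (s->xscroll > 0) {         /* fine scroll: 7 down to 0 */
        s->xscroll--;
        return 0;
    }
    /* Coarse scroll: shift the row one char to the left... */
    memmove(s->row, s->row + 1, ROW_WIDTH - 1);
    /* ...feed the next char into the rightmost column... */
    s->row[ROW_WIDTH - 1] = s->text[s->pos];
    s->pos = (s->pos + 1) % s->len;   /* wrap: not in the original */
    /* ...and snap the fine scroll back to the right. */
    s->xscroll = 7;
    return 1;
}
```

Eight calls to scroller_step() move the text by exactly one character cell, the same eight positions the assembly walks through from $C7 down to $C0 before it shuffles the row.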